This document was created from an R markdown file. The repository for the project can be found here. The data reported in the paper can be explored interactively at the Metalab website.
| Moderator | Estimate | z value | p value |
|---|---|---|---|
| Mean Age | 0 [0, 0] | -1.40 | 0.16 |
| Median Productive Vocabulary Size | -0.01 [-0.02, 0] | -1.74 | 0.08 |
| Predicate Type (Transitive / Intransitive) | 0.27 [0.05, 0.5] | 2.36 | 0.02 |
| Agent Argument Type (Pronoun / Noun) | 0.15 [-0.21, 0.52] | 0.82 | 0.41 |
| Moderator | Estimate | z value | p value |
|---|---|---|---|
| Number of Sentence Repetitions | 0.01 [-0.02, 0.03] | 0.43 | 0.67 |
| Stimuli Synchronicity (Simultaneous / Asynchronous) | 0.1 [-0.21, 0.41] | 0.65 | 0.52 |
| Character Identification Phase (Yes / No) | 0.19 [-0.24, 0.62] | 0.85 | 0.39 |
| Practice Phase (Yes / No) | -0.21 [-0.5, 0.09] | -1.39 | 0.17 |
| Testing Structure (Mass / Distributed) | 0.34 [-0.08, 0.76] | 1.60 | 0.11 |
To standardize the effect size calculation, we converted some reported raw results to the proportion of correct responses. For looking-time studies in which the paper reported only raw looking times in seconds, we calculated the proportion of correct responses by dividing the mean looking time toward the matching scene by the sum of the looking times toward the matching and non-matching scenes (i.e., look-away time was excluded from the denominator). The raw measures of variability were converted to the same scale by dividing them by that sum.
Below is a step-by-step example calculation using data from Yuan & Fisher (2009), Experiment 1. Cells show the mean looking time in seconds, with the standard error in parentheses:
| Dialogue Type | Sample Size | Two-participant Event | One-participant Event |
|---|---|---|---|
| Transitive | 8 | 4.82 (0.43) | 2.87 (0.51) |
| Intransitive | 8 | 3.33 (0.24) | 4.12 (0.40) |
When a paper provided only raw looking-time data, we converted the means to proportions of correct looking time, and the reported standard errors to standard deviations on the same scale, using the formulae below (\(N\) is the sample size).
\[\begin{aligned} Mean_{Proportion} &= \frac{Time_{correct}}{Time_{correct} + Time_{incorrect}} \\ SD_{Proportion} &= \frac{SE_{Raw}}{Time_{correct} + Time_{incorrect}} \times \sqrt{N} \end{aligned}\]

| Dialogue Type | Sample Size | Mean Proportion | Standard Deviation |
|---|---|---|---|
| Transitive | 8 | 0.627 | 0.158 |
| Intransitive | 8 | 0.553 | 0.152 |
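The worked example can be reproduced with a short script. The sketch below is written in Python rather than the authors' R code, so it is an illustration and not their implementation. It assumes that the parenthesized values in the looking-time table are standard errors (as the \(SD_{Proportion}\) formula implies), that the matching scene is the two-participant event for transitive dialogues and the one-participant event for intransitive dialogues, and that chance is 0.5. The function names are hypothetical. It covers the conversion above as well as the one-sample Cohen's d and variance step described next.

```python
import math


def proportion_from_raw(time_correct, time_incorrect, se_correct, n):
    """Convert raw looking times (seconds) to a proportion and its SD."""
    total = time_correct + time_incorrect      # look-away time is excluded
    mean_prop = time_correct / total
    # SE -> SD (multiply by sqrt(n)), rescaled to proportion units
    sd_prop = se_correct / total * math.sqrt(n)
    return mean_prop, sd_prop


def one_sample_d(mean_prop, sd_prop, n, chance=0.5):
    """Cohen's d against chance, with its sampling variance."""
    d = (mean_prop - chance) / sd_prop
    var_d = 1 / n + d ** 2 / (2 * n)
    return d, var_d


# (time_correct, time_incorrect, SE of time_correct, N), from the table above
conditions = {
    "transitive": (4.82, 2.87, 0.43, 8),
    "intransitive": (4.12, 3.33, 0.40, 8),
}

results = {}
for name, (correct, incorrect, se, n) in conditions.items():
    mean_prop, sd_prop = proportion_from_raw(correct, incorrect, se, n)
    d, var_d = one_sample_d(mean_prop, sd_prop, n)
    results[name] = (mean_prop, sd_prop, d, var_d)
    print(f"{name}: proportion={mean_prop:.3f}, SD={sd_prop:.3f}, "
          f"d={d:.2f}, var(d)={var_d:.2f}")
```

Run on these inputs, the sketch reproduces the proportions and standard deviations in the table. For the transitive condition it gives a d of about 0.80, marginally different from the 0.79 reported in the worked example.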
Then we calculated Cohen’s d and its variance as follows (the implementation of the script can be found at XXX):
\[\begin{aligned}
d_{transitive} &= \frac{M_1 - M_2}{\sigma_{pooled}} \\
&= \frac{M_{correct} - M_{chance}}{\sigma_{correct}} \\
&= \frac{0.627 - 0.5}{0.158} \\
&\approx 0.79 \\
\\
d_{intransitive} &= \frac{M_1 - M_2}{\sigma_{pooled}} \\
&= \frac{M_{correct} - M_{chance}}{\sigma_{correct}} \\
&= \frac{0.553 - 0.5}{0.152} \\
&\approx 0.35
\end{aligned}\]

\[\begin{aligned}
var(d_{transitive}) &= \frac{1}{N} + \frac{d^2}{2 \times N} \\
&= \frac{1}{8} + \frac{0.79^2}{2 \times 8} \\
&\approx 0.16 \\
var(d_{intransitive}) &= \frac{1}{N} + \frac{d^2}{2 \times N} \\
&= \frac{1}{8} + \frac{0.35^2}{2 \times 8} \\
&\approx 0.13
\end{aligned}\]

| Model | Intercept Estimates | Intercept z value | Intercept p value | Moderator Estimates | Moderator z value | Moderator p value |
|---|---|---|---|---|---|---|
| Null model | 0.25 [0.04, 0.45] | 2.37 | 0.02 | NA [NA, NA] | NA | NA |
| Mean age model | 0.6 [0.07, 1.13] | 2.21 | 0.03 | 0 [0, 0] | -1.40 | 0.16 |
| Median productive vocabulary size model | 0.66 [0.04, 1.28] | 2.08 | 0.04 | -0.01 [-0.02, 0] | -1.74 | 0.08 |
| Sentence structure model (transitive / intransitive) | 0.09 [-0.15, 0.34] | 0.73 | 0.46 | 0.27 [0.04, 0.5] | 2.34 | 0.02 |
| Agent argument type model (pronoun / noun) | 0.18 [-0.08, 0.44] | 1.32 | 0.19 | 0.17 [-0.21, 0.54] | 0.86 | 0.39 |
| Number of sentence repetitions model | 0.2 [-0.1, 0.5] | 1.30 | 0.19 | 0.01 [-0.02, 0.03] | 0.46 | 0.65 |
| Stimuli synchronicity model (simultaneous / asynchronous) | 0.21 [-0.04, 0.45] | 1.67 | 0.10 | 0.1 [-0.22, 0.41] | 0.59 | 0.55 |
| Character identification model (yes / no) | 0.18 [-0.07, 0.44] | 1.43 | 0.15 | 0.2 [-0.25, 0.65] | 0.88 | 0.38 |
| Practice phase model (yes / no) | 0.37 [0.1, 0.64] | 2.66 | 0.01 | -0.21 [-0.51, 0.09] | -1.35 | 0.18 |
| Testing structure model (mass / distributed) | 0.14 [-0.09, 0.38] | 1.23 | 0.22 | 0.36 [-0.07, 0.79] | 1.63 | 0.10 |
| Model | Intercept Estimates | Intercept z value | Intercept p value | Moderator Estimates | Moderator z value | Moderator p value |
|---|---|---|---|---|---|---|
| Null model | 0.25 [0.06, 0.45] | 2.51 | 0.01 | NA [NA, NA] | NA | NA |
| Mean age model | 0.6 [0.08, 1.13] | 2.24 | 0.02 | 0 [0, 0] | -1.40 | 0.16 |
| Median productive vocabulary size model | 0.66 [0.04, 1.28] | 2.08 | 0.04 | -0.01 [-0.02, 0] | -1.74 | 0.08 |
| Sentence structure model (transitive / intransitive) | 0.09 [-0.15, 0.33] | 0.77 | 0.44 | 0.27 [0.05, 0.5] | 2.36 | 0.02 |
| Agent argument type model (pronoun / noun) | 0.19 [-0.06, 0.44] | 1.49 | 0.14 | 0.15 [-0.21, 0.52] | 0.82 | 0.41 |
| Number of sentence repetitions model | 0.21 [-0.08, 0.5] | 1.42 | 0.16 | 0.01 [-0.02, 0.03] | 0.43 | 0.67 |
| Stimuli synchronicity model (simultaneous / asynchronous) | 0.21 [-0.03, 0.45] | 1.71 | 0.09 | 0.1 [-0.21, 0.41] | 0.65 | 0.52 |
| Character identification model (yes / no) | 0.2 [-0.04, 0.44] | 1.59 | 0.11 | 0.19 [-0.24, 0.62] | 0.85 | 0.39 |
| Practice phase model (yes / no) | 0.37 [0.11, 0.62] | 2.81 | 0.01 | -0.21 [-0.5, 0.09] | -1.39 | 0.17 |
| Testing structure model (mass / distributed) | 0.16 [-0.06, 0.38] | 1.39 | 0.17 | 0.34 [-0.08, 0.76] | 1.60 | 0.11 |
##
## Multivariate Meta-Analysis Model (k = 60; method: REML)
##
## logLik Deviance AIC BIC AICc
## -50.5294 101.0587 111.0587 121.4464 112.1908
##
## Variance Components:
##
## estim sqrt nlvls fixed
## sigma^2.1 0.0952 0.3085 18 no
## sigma^2.2 0.0206 0.1437 49 no
## sigma^2.3 0.0420 0.2049 60 no
## sigma^2.4 0.0420 0.2049 60 no
## factor
## sigma^2.1 short_cite
## sigma^2.2 short_cite/expt_condition
## sigma^2.3 short_cite/expt_condition/same_infant
## sigma^2.4 short_cite/expt_condition/same_infant/row_id
##
## Test for Heterogeneity:
## Q(df = 59) = 197.7455, p-val < .0001
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## 0.2549 0.0984 2.5892 0.0096 0.0619 0.4478 **
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Multivariate Meta-Analysis Model (k = 60; method: REML)
##
## logLik Deviance AIC BIC AICc
## -50.3113 100.6227 108.6227 116.9328 109.3634
##
## Variance Components:
##
## estim sqrt nlvls fixed factor
## sigma^2.1 0.1013 0.3183 18 no short_cite
## sigma^2.2 0.1066 0.3265 59 no short_cite/same_infant
## sigma^2.3 0.0000 0.0000 60 no short_cite/same_infant/row_id
##
## Test for Heterogeneity:
## Q(df = 59) = 197.7455, p-val < .0001
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## 0.2531 0.1008 2.5113 0.0120 0.0556 0.4507 *
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The plot below shows a modified funnel plot, or “significance funnel”, in which significant studies are shown in orange and non-significant studies in grey. The x-axis shows the effect size estimates, and the y-axis shows the estimated standard error of each estimate. Studies lying on the grey line have a p-value of .05. The black diamond shows the meta-analytic effect size estimate for all studies; the grey diamond shows the meta-analytic effect size estimate for non-significant studies only (the “worst-case” publication scenario). Note that the worst-case scenario appreciably attenuates the effect size estimate but does not attenuate the point estimate to 0 (worst-case estimate: 0.06 [-0.13, 0.24]).
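How a study is classified relative to the p = .05 line can be sketched as follows. This is a minimal Python illustration that assumes a two-sided z-test (normal approximation) on each estimate; the function names and the example values are hypothetical and are not taken from the dataset.

```python
import math

Z_CRIT = 1.959964  # two-sided critical z value for alpha = .05


def two_sided_p(effect, se):
    """Two-sided p-value for an estimate under a normal approximation."""
    z = effect / se
    return math.erfc(abs(z) / math.sqrt(2))


def is_significant(effect, se, alpha=0.05):
    """True if the study would be drawn as 'significant' in the funnel."""
    return two_sided_p(effect, se) < alpha


def boundary_se(effect):
    """Standard error at which an estimate lies exactly on the p = .05 line."""
    return abs(effect) / Z_CRIT


# Hypothetical (effect size, standard error) pairs
for effect, se in [(0.5, 0.2), (0.2, 0.2)]:
    print(effect, se, round(two_sided_p(effect, se), 3), is_significant(effect, se))
```

Under this assumption the line is straight: an estimate is significant whenever its standard error is smaller than the absolute effect size divided by about 1.96.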
The heatmaps below show the overlap between moderators. Each cell corresponds to the co-occurrence of two moderator levels. Brighter colors indicate a higher frequency of co-occurrence, and darker colors indicate a lower frequency. Hover the mouse over a cell to see its value and the combination of levels it represents.
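The counts behind such a heatmap can be sketched as below. This is a minimal Python illustration and not the authors' code: the rows and the moderator names are made up, and each row stands for one effect size with its coded moderator levels.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: one dict of moderator levels per effect size
rows = [
    {"practice_phase": "yes", "test_structure": "mass"},
    {"practice_phase": "yes", "test_structure": "distributed"},
    {"practice_phase": "no", "test_structure": "mass"},
    {"practice_phase": "yes", "test_structure": "mass"},
]


def cooccurrence_counts(rows):
    """Count how often each pair of moderator levels appears in the same row."""
    counts = Counter()
    for row in rows:
        # Sort so that each pair is always stored in the same order
        levels = sorted(f"{name}={level}" for name, level in row.items())
        for a, b in combinations(levels, 2):
            counts[(a, b)] += 1
    return counts


counts = cooccurrence_counts(rows)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Each key of `counts` corresponds to one heatmap cell, and its value determines the cell's color.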
| Parameter | Estimates | z value | p value |
|---|---|---|---|
| Intercept | 0.03 [-0.88, 0.93] | 0.06 | 0.95 |
| Character Identification Phase (Yes / No) | 0.26 [-0.28, 0.8] | 0.94 | 0.35 |
| Practice Phase (Yes / No) | -0.11 [-0.45, 0.23] | -0.62 | 0.54 |
| Stimuli Synchronicity (Simultaneous / Asynchronous) | 0.26 [-0.28, 0.81] | 0.95 | 0.34 |
| Testing Structure (Mass / Distributed) | 0.46 [0, 0.92] | 1.98 | 0.05* |
| Number of Sentence Repetitions | 0.02 [-0.02, 0.06] | 0.83 | 0.41 |
| Mean Age | 0 [-0.001, 0.0005] | -0.59 | 0.55 |
| Parameter | Estimates | z value | p value |
|---|---|---|---|
| Intercept | -0.24 [-2.19, 1.72] | -0.24 | 0.81 |
| Character Identification Phase (Yes / No) | 0.33 [-0.76, 1.43] | 0.60 | 0.55 |
| Practice Phase (Yes / No) | 0.01 [-1.05, 1.07] | 0.02 | 0.98 |
| Stimuli Synchronicity (Simultaneous / Asynchronous) | 0.17 [-1.14, 1.48] | 0.25 | 0.8 |
| Testing Structure (Mass / Distributed) | 0.95 [0.23, 1.67] | 2.59 | 0.01* |
| Number of Sentence Repetitions | 0.01 [-0.1, 0.12] | 0.14 | 0.89 |
| Median Productive Vocabulary Size | -0.01 [-0.03, 0.01] | -0.74 | 0.46 |
| Parameter | Estimates | z value | p value |
|---|---|---|---|
| Intercept | -0.4 [-1.01, 0.21] | -1.29 | 0.2 |
| Character Identification Phase (Yes / No) | 0.28 [-0.22, 0.79] | 1.10 | 0.27 |
| Practice Phase (Yes / No) | -0.06 [-0.36, 0.24] | -0.41 | 0.68 |
| Stimuli Synchronicity (Simultaneous / Asynchronous) | 0.27 [-0.25, 0.78] | 1.02 | 0.31 |
| Testing Structure (Mass / Distributed) | 0.53 [0.08, 0.98] | 2.32 | 0.02* |
| Number of Sentence Repetitions | 0.02 [-0.02, 0.06] | 1.12 | 0.26 |
| Sentence Structure (Transitive / Intransitive) | 0.28 [0.03, 0.52] | 2.24 | 0.03* |
| Parameter | Estimates | z value | p value |
|---|---|---|---|
| Intercept | -0.59 [-1.27, 0.1] | -1.67 | 0.09 |
| Character Identification Phase (Yes / No) | 0.27 [-0.2, 0.75] | 1.13 | 0.26 |
| Practice Phase (Yes / No) | -0.2 [-0.5, 0.09] | -1.37 | 0.17 |
| Stimuli Synchronicity (Simultaneous / Asynchronous) | 0.61 [0.01, 1.21] | 2.00 | 0.05* |
| Testing Structure (Mass / Distributed) | 0.08 [-0.5, 0.66] | 0.28 | 0.78 |
| Number of Sentence Repetitions | 0.04 [0, 0.08] | 1.95 | 0.05* |
| Agent Argument Type (Pronoun / Noun) | 0.55 [-0.05, 1.16] | 1.80 | 0.07 |
Note: The moderator estimates are for the first level listed in the parentheses.
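To make the coding convention concrete, the sketch below shows how an estimate for the first level would translate into predicted effect sizes. It is a Python illustration that assumes simple 1/0 dummy coding (first level = 1, second level = 0) and, for simplicity, a model with a single categorical moderator. The function name is hypothetical, and the example values are the intercept (0.09) and moderator estimate (0.27) from the sentence structure row of the single-moderator table.

```python
def predicted_effect(intercept, moderator_estimate, is_first_level):
    """Predicted effect size under assumed 1/0 dummy coding of one moderator."""
    indicator = 1 if is_first_level else 0
    return intercept + moderator_estimate * indicator


# Sentence structure model: first level = transitive, second = intransitive
transitive = predicted_effect(0.09, 0.27, is_first_level=True)
intransitive = predicted_effect(0.09, 0.27, is_first_level=False)
print(f"transitive: {transitive:.2f}, intransitive: {intransitive:.2f}")
```

Under this assumption the intercept is the predicted effect for the second level, and the moderator estimate is the difference between the two levels.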
In the main analysis, we presented the model for the relationship between effect size and agent argument type, and found that whether the agent argument was a noun or a pronoun did not significantly predict effect size. Here, we present a similar analysis of the influence of patient argument type. Because English intransitive sentences by definition have no patient argument, we focus on the subset of studies that used transitive sentences (\(N\) = 30).
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.33 [0.08, 0.58] | 2.59 | 0.01 |
| Patient Argument Type (Pronoun / Noun) | -0.03 [-0.47, 0.42] | -0.11 | 0.91 |
We found that the presentation modality of the stimuli was not a significant predictor of effect size. In other words, studies that presented young children with animation clips had effect sizes similar to those of studies using video clips. The model statistics are shown below. Note that the stimuli modality and stimuli actor levels overlapped heavily across studies, so this result should be interpreted with caution.
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.6 [0.13, 1.07] | 2.52 | 0.01 |
| Stimuli Modality (Video / Animation) | -0.37 [-0.82, 0.08] | -1.62 | 0.11 |
Similarly, we did not find an effect of stimuli actors. Studies with human actors as protagonists in the events had effect sizes similar to those of studies using puppets, human actors in animal suits, or animated geometrical shapes.
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.44 [0.16, 0.72] | 3.07 | 0.002 |
| Stimuli Actor (Person / Non-person) | -0.29 [-0.61, 0.02] | -1.81 | 0.07 |
Studies differed in the types of transitive and intransitive events they presented. Previous studies have shown that young children’s looking behavior in the Intermodal Preferential Looking Paradigm is very sensitive to subtle perceptual differences in the visual stimuli (Delle Luche, Durrant, Poltrock, & Floccia, 2015; Fernald, Zangl, Portillo, & Marchman, 2008). Therefore, we coded the types of events presented in the visual stimuli. There were two types of transitive events: direct causal action and indirect causal action. The former involved the agent directly acting on the patient and causing the patient to move. The latter involved a means-end sequence leading to the caused action of the patient. For example, the agent may pull a band around the patient’s waist and cause it to move. There were also two types of intransitive events in the literature. One involved a single actor acting, such as jumping up and down. The other involved two actors presented without any causal action between them.
Our models suggested that neither variable was predictive of effect size.
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.11 [-0.32, 0.54] | 0.52 | 0.61 |
| Transitive Event Type (Indirect caused action / Direct caused action) | 0.2 [-0.3, 0.7] | 0.79 | 0.43 |
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.32 [-0.02, 0.66] | 1.82 | 0.07 |
| Intransitive Event Type (Parallel actions / One action) | -0.08 [-0.42, 0.26] | -0.45 | 0.65 |
There was some evidence that researchers adapted the visual complexity of the stimuli to children’s age. We collected the available visual stimuli from the papers and their supporting materials. Schematic illustrations of the visual stimuli were used when actual screenshots were not provided. Screenshots of the text descriptions of the events were used when the visual stimuli were unavailable. Note that because some publishers had converted the visual stimuli to black-and-white, we converted all visual stimuli to grayscale for easier visual comparison.
The plot shows that studies with particularly young children used noticeably simpler visual stimuli. This adaptation might be partly responsible for the lack of an age effect in our sample.
References
Delle Luche, C., Durrant, S., Poltrock, S., & Floccia, C. (2015). A methodological investigation of the Intermodal Preferential Looking paradigm: Methods of analyses, picture selection and data rejection criteria. Infant Behavior and Development, 40, 151-172.
Fernald, A., Zangl, R., Portillo, A. L., & Marchman, V. A. (2008). Looking while listening: Using eye movements to monitor spoken language. Developmental psycholinguistics: On-line methods in children’s language processing, 44, 97.